Search Result

Ensemble classification model for distributed drifted data streams

YIN Chunyong, ZHANG Guojie

Journal of Computer Applications 2021, 41 (7): 1947-1955. DOI: 10.11772/j.issn.1001-9081.2020081277

To address low classification accuracy in big data environments, an ensemble classification model for distributed data streams was proposed. Firstly, micro-clusters were used to reduce the amount of data transmitted from local nodes to the central nodes, thereby reducing the communication cost. Secondly, the training samples of the global classifier were generated by a sample reconstruction algorithm. Finally, an ensemble classification model for drifting data streams was built, which adopted a weighted combination of dynamic classifiers and steady classifiers and used a mixed labeling strategy to label the most representative instances for updating the ensemble. Experiments on two synthetic datasets and two real datasets showed that the model fluctuated less under concept drift than the two distributed mining models DS-means and BDS-ensemble, and achieved higher accuracy than the Online Active Learning Ensemble model (OALEnsemble), with the accuracy on the four datasets improved by 1.58, 0.97, 0.77 and 1.91 percentage points respectively. Although the memory consumption of the model was slightly higher than that of the BDS-ensemble and DS-means models, it improved the classification performance at a relatively low memory cost. Therefore, the model is suitable for classifying big data with distributed and mobile characteristics, as found in network monitoring and banking business systems.
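
The abstract does not specify how the micro-clusters are represented. As a minimal sketch, assuming the widely used cluster-feature summary (point count, per-dimension linear sum and squared sum), the following shows why a local node can send a few numbers instead of the raw points. The class name `MicroCluster` and its methods are illustrative and not taken from the paper.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class MicroCluster:
    """Hypothetical summary of a group of points: count, linear sum, squared sum."""
    dim: int
    n: int = 0
    linear_sum: list = field(default_factory=list)
    square_sum: list = field(default_factory=list)

    def __post_init__(self):
        self.linear_sum = [0.0] * self.dim
        self.square_sum = [0.0] * self.dim

    def add(self, point):
        # Absorb one point; the raw point itself is not stored.
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.square_sum[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        # Root-mean-square deviation of the absorbed points from the centroid.
        var = sum(sq / self.n - (ls / self.n) ** 2
                  for ls, sq in zip(self.linear_sum, self.square_sum))
        return sqrt(max(var, 0.0))


mc = MicroCluster(dim=2)
for p in [(0.0, 0.0), (2.0, 0.0)]:
    mc.add(p)
print(mc.centroid(), mc.radius())
```

Under this assumption, a local node transmits only `n`, `linear_sum` and `square_sum` per micro-cluster, which is the kind of saving in communication cost the abstract describes.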

Sequential multimodal sentiment analysis model based on multi-task learning

ZHANG Sun, YIN Chunyong

Journal of Computer Applications 2021, 41 (6): 1631-1639. DOI: 10.11772/j.issn.1001-9081.2020091416

To address the issues of unimodal feature representation and cross-modal feature fusion in sequential multimodal sentiment analysis, a sentiment analysis model based on multi-task learning and the multi-head attention mechanism was proposed. Firstly, Convolutional Neural Network (CNN), Bidirectional Gated Recurrent Unit (BiGRU) and Multi-Head Self-Attention (MHSA) were used to obtain the sequential unimodal feature representations. Secondly, the bidirectional cross-modal information was fused by multi-head attention. Finally, based on multi-task learning, sentiment polarity classification and sentiment intensity regression were added as auxiliary tasks to improve the overall performance of the main task of sentiment score regression. Experimental results demonstrate that, compared with the multimodal factorization model, the proposed model improves the accuracy of binary classification by 7.8 percentage points and 3.1 percentage points respectively on the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) and CMU Multimodal Opinion level Sentiment Intensity (CMU-MOSI) datasets. Therefore, the proposed model is applicable to sentiment analysis problems in multimodal scenarios, and can provide decision support for product recommendation, stock market forecasting, public opinion monitoring and other relevant applications.
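
The multi-task objective described above can be illustrated with a small sketch. It assumes, and the abstract does not confirm, that the polarity label is the sign of the sentiment score, that the intensity label is its absolute value, and that the three losses are combined by a weighted sum. The function name and the weights `w_polarity` and `w_intensity` are hypothetical.

```python
from math import log


def multitask_loss(score_true, score_pred, polarity_prob, intensity_pred,
                   w_polarity=0.5, w_intensity=0.5):
    """Main regression loss plus two weighted auxiliary losses (illustrative)."""
    # Assumed derivation of the auxiliary labels from the sentiment score.
    polarity_true = 1.0 if score_true >= 0 else 0.0
    intensity_true = abs(score_true)

    # Main task: squared error on the sentiment score.
    main = (score_pred - score_true) ** 2

    # Auxiliary task 1: binary cross-entropy on polarity (probability clipped).
    eps = 1e-12
    p = min(max(polarity_prob, eps), 1.0 - eps)
    polarity = -(polarity_true * log(p) + (1.0 - polarity_true) * log(1.0 - p))

    # Auxiliary task 2: squared error on sentiment intensity.
    intensity = (intensity_pred - intensity_true) ** 2

    return main + w_polarity * polarity + w_intensity * intensity


print(multitask_loss(score_true=-1.0, score_pred=0.0,
                     polarity_prob=0.5, intensity_pred=1.0))
```

The auxiliary terms only shape the shared representation during training; at inference time the sentiment score output alone would be used.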

End-to-end adversarial variational Bayes method for short text sentiment classification

YIN Chunyong, ZHANG Sun

Journal of Computer Applications 2020, 40 (9): 2536-2542. DOI: 10.11772/j.issn.1001-9081.2020010048

To address the low accuracy of sentiment classification on short texts, an end-to-end short text sentiment classifier based on adversarial learning and variational inference was proposed. First, spectral normalization was employed to alleviate the oscillation of the discriminator during training. Second, an additional classifier was used to guide the updating of the inference model. Third, Adversarial Variational Bayes (AVB) was used to extract the topic features of the short text. Finally, the topic features and pre-trained word vector features were fused through three applications of the attention mechanism to perform the classification. Experimental results on one product review dataset and two micro-blog datasets show that the proposed model improves the accuracy by 2.9, 2.2 and 8.4 percentage points respectively compared with the Bidirectional Long Short-Term Memory network based on Self-Attention (BiLSTM-SA). The proposed model can therefore be applied to mining sentiments and opinions in short social texts, which is significant for public opinion discovery, user feedback, quality supervision and other related fields.
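
Spectral normalization, mentioned in the first step, rescales a weight matrix by its largest singular value so that the layer cannot amplify its input by more than a factor of one. The following is a standalone sketch of that idea using power iteration in plain Python; it is not the authors' implementation, and the function names are illustrative. Deep learning frameworks typically run a single power-iteration step per training update rather than iterating to convergence as done here.

```python
import random
from math import sqrt


def _normalize(x, eps=1e-12):
    n = sqrt(sum(a * a for a in x))
    return [a / (n + eps) for a in x]


def largest_singular_value(W, iters=100, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rows, cols = len(W), len(W[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    u = [0.0] * rows
    for _ in range(iters):
        u = _normalize([sum(W[i][j] * v[j] for j in range(cols)) for i in range(rows)])
        v = _normalize([sum(W[i][j] * u[i] for i in range(rows)) for j in range(cols)])
    # sigma = u^T W v
    return sum(u[i] * W[i][j] * v[j] for i in range(rows) for j in range(cols))


def spectral_normalize(W):
    """Return W divided by its largest singular value (W unchanged if it is zero)."""
    sigma = largest_singular_value(W)
    if sigma <= 0.0:
        return W
    return [[w / sigma for w in row] for row in W]


W = [[3.0, 0.0], [0.0, 1.0]]
print(largest_singular_value(W))
print(spectral_normalize(W))
```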

Text classification based on improved capsule network

YIN Chunyong, HE Miao

Journal of Computer Applications 2020, 40 (9): 2525-2530. DOI: 10.11772/j.issn.1001-9081.2019122153

To solve the problems that the pooling operation of Convolutional Neural Network (CNN) loses some feature information and that the classification accuracy of Capsule Network (CapsNet) is not high, an improved CapsNet model was proposed. Firstly, two convolution layers were used to extract local features. Then, the CapsNet was used to extract the overall features of the text. Finally, a softmax classifier was used to perform the classification. Compared with CNN and CapsNet, the proposed model improves the classification accuracy by 3.42 percentage points and 2.14 percentage points respectively. The experimental results show that the improved CapsNet model is more suitable for text classification.
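
For readers unfamiliar with capsule networks, the sketch below shows the "squash" nonlinearity from the original CapsNet formulation, which keeps a capsule vector's direction while mapping its length into the range [0, 1). Whether the improved model in this paper uses the same function unchanged is an assumption; the abstract does not say.

```python
from math import sqrt


def squash(s, eps=1e-9):
    """Scale vector s to length |s|^2 / (1 + |s|^2), preserving its direction."""
    sq_norm = sum(x * x for x in s)
    norm = sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * x for x in s]


def length(v):
    return sqrt(sum(x * x for x in v))


v = squash([3.0, 4.0])
print(v, length(v))
```

Long input vectors end up with length close to 1 and short ones close to 0, so the output length can be read as the strength of the feature the capsule represents.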

Fake review detection model based on vertical ensemble Tri-training

YIN Chunyong, ZHU Yuhang

Journal of Computer Applications 2020, 40 (8): 2194-2201. DOI: 10.11772/j.issn.1001-9081.2019112046

To address the problems that fake reviews mislead users and harm their interests and that large-scale manual labeling of reviews is too costly, a fake review detection model based on Vertical Ensemble Tri-Training (VETT) was proposed, which uses the classification models generated in previous iterations to improve detection accuracy. In the model, user behavior characteristics were combined with review text characteristics to perform feature extraction. In the VETT algorithm, the iterative process was divided into two parts: vertical ensemble within a group and horizontal ensemble between groups. The in-group ensemble constructs an original classifier from the models of that classifier in previous iterations, and the inter-group ensemble trains the three original classifiers through the traditional process to obtain the second-generation classifiers of the current iteration, thereby improving the accuracy of the labels. Compared with the Co-training, Tri-training, PU learning based on Area Under Curve (PU-AUC) and Vertical Ensemble Co-training (VECT) algorithms, the VETT algorithm increases the maximum F1 value by 6.5, 5.08, 4.27 and 4.23 percentage points respectively. Experimental results show that the proposed VETT algorithm has better classification performance.
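
The two ensemble directions can be sketched as follows. This is a simplified illustration under stated assumptions, not the VETT algorithm itself: the vertical step is assumed to be a majority vote over one classifier's predictions from successive iterations, and the horizontal step follows the standard tri-training rule in which an unlabeled sample is pseudo-labeled for one classifier when the other two agree. Both function names are hypothetical.

```python
from collections import Counter


def vertical_vote(history_predictions):
    """Majority vote over one classifier's predictions from successive iterations."""
    return Counter(history_predictions).most_common(1)[0][0]


def tri_training_pseudo_labels(preds):
    """preds holds three label lists, one per classifier, over the same samples.

    Returns one dict per classifier mapping sample index to pseudo-label,
    containing only the samples on which the other two classifiers agree.
    """
    result = []
    for i in range(3):
        j, k = [c for c in range(3) if c != i]
        result.append({idx: a
                       for idx, (a, b) in enumerate(zip(preds[j], preds[k]))
                       if a == b})
    return result


print(vertical_vote(["fake", "real", "fake"]))
print(tri_training_pseudo_labels([[1, 0, 1], [1, 1, 0], [0, 1, 1]]))
```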
